Abstract. In many applications one is interested in detecting certain (known) patterns in the mean of a process with the smallest possible delay. Using an asymptotic framework which allows us to capture that feature, we study a class of appropriate sequential nonparametric kernel procedures under local nonparametric alternatives. We prove a new theorem on the convergence of the normed delay of the associated sequential detection procedure which holds for dependent time series under a weak mixing condition. The result suggests a simple procedure to select a kernel from a finite set of candidate kernels, and therefore may also be of interest from a practical point of view. Further, we provide two new theorems about the existence and an explicit representation of optimal kernels minimizing the asymptotic normed delay. The results are illustrated by some examples.



Keywords: Enzyme kinetics, financial econometrics, nonparametric regression, statistical genetics, quality control.
Introduction



A classical problem of sequential analysis is to detect a location shift in a univariate time series by a binary decision procedure. More generally, we aim at testing sequentially whether the deterministic drift function vanishes (in-control or null model) or is equal to an out-of-control or alternative model. There are various important fields where such methods can be applied. We shall first briefly describe some fields of application which motivated the topics discussed in this article.

Sequential methods have long been applied in quality control and statistical process
control, where interest focuses on detecting the first time point where a production process 
fails. Failures of a machine may produce jumps in the sequence of the observed quality 
characteristic, whereas wastage may result in smooth but possibly nonlinear changes of 
the mean. In recent years there has been considerable interest in methods for dependent 
time series. 

An active area is the on-line monitoring of sequential data streams from capital markets. Indeed, an analyst's task is to detect structural changes in financial data as soon as possible in order to trigger actions such as portfolio updates or hedges. Thus, methods designed to support sequential decision making are in order.

A further potential field of application is the analysis of microarray time series data consisting of gene expression levels. Down- or upregulated genes can have important interpretations, e.g., when characterizing cancer cells, and the sequential detection of such level changes from time series could be of considerable value.

In biology sequential methods may be useful to study the temporal evolution of enzyme 
kinetics in order to detect time points where a reaction starts or exceeds a prespecified 
threshold. Often it is possible to associate certain (worst case) temporal patterns with 
phenomena, e.g., symptoms or reactions to stimuli, which are of biological interest to detect. 
In order to understand complex biological systems it may be useful to estimate such change 
points sequentially instead of applying a posteriori methods, since the behavior of the real 
biological system depends only on the past. 

A basic model to capture level changes as motivated by the above application areas is as follows. Suppose we are observing a possibly non-stationary stochastic process, $\{Y(t) : t \in \mathcal{T}\}$, in continuous or discrete time $\mathcal{T} = [0, \infty)$ with $E|Y(t)| < \infty$. Consider the following decomposition of the process into a possibly non-homogeneous drift $m(t) = EY(t)$ and an error process $\{\varepsilon_t : t \in \mathcal{T}\}$,

$$ Y(t) = m(t) + \varepsilon_t, \qquad t \in \mathcal{T}. $$

Using the terms of statistical process control, we will say that the process is in-control if $m(t) = 0$ for each $t \in \mathcal{T}$, and we are interested in kernel control charts, i.e., sequential kernel-based smoothing methods, to detect the first time point where the process gets out-of-control. Clearly, from a testing point of view we are sequentially testing the null hypothesis $H_0 : m = 0$ against the alternative $m \neq 0$. A common approach to the problem is to define a stopping rule (stopping time), $N$, based on some statistic that estimates at each time point $t$ a functional of $\{m(s) : s \le t\}$. Having defined a stopping rule, the stochastic properties of the associated delay, defined as $N$ minus the change-point, are of interest.

Well-known stopping rules rely on CUSUM-, EWMA-, or Shewhart-type control charts which can often be tuned for the problem at hand. These proposals are motivated by certain optimality criteria and have been studied extensively in the literature. First publications are due to Page (1954, 1955) and Girshick and Rubin (1952). Optimality properties of the CUSUM procedure in the sense of Lorden (1971), i.e., minimizing the conditional expectation of the delay given the least favorable event before the change-point, were first shown by Moustakides (1986) and, using Bayesian arguments, by Ritov (1990, 1997) and Yakir (1997). For a discussion of EWMA control schemes see Schmid and Schoene (1997). Yakir, Krieger, and Pollak (1999) studied first order optimality of the CUSUM and Shiryayev-Roberts procedures to detect a change in regression. Their result deals with optimal stopping rules in the sense that the expected delay is minimal subject to a constraint on the average run length to a false alarm. However, that result is restricted to independent and normally distributed observations. A kernel-based a posteriori procedure for detecting multiple change points which is in the spirit of the present article has been studied by Huskova and Slaby (1997) and Grabovsky, Horvath and Huskova (2000). For reviews we refer to Huskova (1991) and Antoch, Huskova, and Jaruskova (2002).

The present paper provides an asymptotic analysis with local alternatives, which holds for a rich class of strongly mixing dependent processes. Our stopping rule uses a Priestley-Chao type kernel regression estimate which relies on a weighted sum of past observations without assuming knowledge of the alternative regression function or an estimate of it. The theory and application of such smoothing methods is nicely described in Hart (1997) or Härdle (1990). However, it is important to note that the framework and assumptions of the present paper are different from classical nonparametric regression. Whereas in nonparametric regression it is assumed that the bandwidth $h$ tends to $0$ such that $nh \to \infty$, $n$ denoting the (fixed) sample size, and $\max_i (t_i - t_{i-1}) \to 0$, as $n \to \infty$, our monitoring approach works with $h \to \infty$ and $t_i - t_{i-1} \ge \Delta$ for all $i$, where $\Delta > 0$ is a fixed minimal distance.

Sequential smoothing procedures, where a regression estimate is evaluated at the current observation, have been studied for various change-point problems, e.g. to monitor the derivative of a process mean (Schmid and Steland, 2000). Note also that they are implicitly applied in classical (fixed sample) nonparametric regression at the boundary. Of course, it is of special interest to study the simultaneous effect of both the kernel and the alternative drift on the asymptotic normed delay of the associated stopping rule. We prove a limit theorem addressing this question for general mixing processes. We then ask how to optimize the procedure w.r.t. the smoothing kernel for certain regression alternatives. It turns out that an explicit representation of the optimal kernel can be derived for arguments not exceeding the associated asymptotic optimal delay. For simple location shifts first results for the normed delay have been obtained by Brodsky and Darkhovsky (1993, 2000) for sequential kernel smoothers as studied here. When jumps are expected, jump-preserving estimators as discussed in Lee (1983), Chiu et al. (1998), Rue et al. (2002), and Pawlak and Rafajlowicz (2000, 2001) are an attractive alternative, since smoothers tend to smooth away jumps. Convergence results for the normed delay of jump-preserving stopping rules have been studied in Steland (2002a), where upper bounds for the asymptotic normed delay are established. For a Bayesian view on the asymptotic normed delay and optimal prior choice see Steland (2002b). On-line monitoring has been recently reviewed by Antoch and Jaruskova (2002) and Frisen (2003). We also refer to Siegmund (1985).

We shall now explain the asymptotic framework of our approach in more detail. In order to evaluate a detection procedure we will consider local alternatives which converge to the in-control model as the (effective) sample size of the procedure tends to infinity. Simultaneously, the false-alarm rate will tend to 0. The local nonparametric alternatives studied here are given as a parameterized family of drift functions,

$$ m(t) = m(t; h) = \mathbf{1}(t \ge t^*)\, m_0([t - t^*]/h), $$

where $t^* \in \mathcal{T}$ stands for the change-point assumed to be fixed but unknown, and $h > 0$ is a bandwidth parameter of the detection procedure introduced below determining the amount of past data used by the procedure. We assume $h \in \mathcal{H}$ for some countable and unbounded set $\mathcal{H} \subset \mathbb{R}_0^+$. $m_0$ denotes the generic model alternative inducing the sequence of local alternatives. We assume that $m_0$ is a piecewise Lipschitz continuous function. Our asymptotics will assume $h \to \infty$. Consequently, for each fixed $t \ge t^*$ we have $m(t; h) \to m_0(0)$, as $h \to \infty$, if $m_0$ is continuous in 0. In this sense, $m(t; h)$ defines a sequence of local alternatives if $m_0(0) = 0$. As we shall see below, $h$ coincides with the bandwidth parameter determining the sample size of the kernel smoother on which the sequential detection procedure is based. It turns out that the rate of convergence of the local alternative has to be related to the bandwidth parameter in this fashion to obtain a meaningful convergence result.
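The local alternative can be made concrete with a few lines of Python. This is only an illustrative sketch: the function names and the linear choice of the generic alternative (`linear_drift`, slope 2) are hypothetical and not taken from the paper. It evaluates $m(t; h) = \mathbf{1}(t \ge t^*)\, m_0([t - t^*]/h)$ and shows that, for a fixed time point, the drift shrinks towards $m_0(0) = 0$ as $h$ grows.

```python
def local_alternative(t, h, t_star, m0):
    """Drift m(t; h) = 1(t >= t_star) * m0((t - t_star) / h)."""
    if t < t_star:
        return 0.0  # in-control before the change-point
    return m0((t - t_star) / h)


def linear_drift(u):
    """Hypothetical generic alternative with m0(0) = 0 (illustration only)."""
    return 2.0 * u


# For fixed t = 60 and change-point t* = 50 the drift tends to m0(0) = 0 as h grows.
values = [local_alternative(60.0, h, 50.0, linear_drift) for h in (10.0, 100.0, 1000.0)]
```

The list `values` is decreasing in `h`, which is the sense in which the family is "local": the out-of-control model approaches the in-control model as the bandwidth increases.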

Assume the process is sampled at a sequence of fixed ordered time points, $\{t_n : n \in \mathbb{N}\}$, inducing a sequence of observations $\{Y_n : n \in \mathbb{N}\}$. Put $m_{nh} = m(t_n; h)$ and $\varepsilon_n = \varepsilon(t_n)$ to obtain

$$ Y_{nh} = m_{nh} + \varepsilon_n, \qquad (n \in \mathbb{N}). $$

Let $q$ denote the integer ensuring $t_q = \lfloor t^* \rfloor + 1$. Then

$$ m_{nh} = \mathbf{1}(t_n \ge t_q)\, m_0([t_n - t_q]/h), \qquad n \in \mathbb{N}. $$

We do not assume that the time design becomes dense in some sense. On the contrary, we use time points having a fixed minimal distance, and for simplicity we shall assume $t_n = n$ for all $n \in \mathbb{N}$. More general time designs will be discussed at the end of Section 2.

The organization of the paper is as follows. Section 1 provides basic notation, assumptions, and the definition of the kernel detection procedure. The limit theorem for the normed delay is established in Section 2. The result holds for a wide class of generic alternatives satisfying a mild integrability condition, provided that the smoothing kernel is Lipschitz continuous. Section 3 provides the result on the optimal kernel choice which minimizes the asymptotic normed delay. We provide both an existence theorem and a stronger representation theorem. Due to the close relationship of the optimal kernel and the generic alternative, this result requires both the kernel and the regression alternative to be continuous. We illustrate the results by a couple of examples.
1. Sequential kernel detection and assumptions



We consider the following sequential Priestley-Chao type kernel smoother

$$ \hat m_{nh} = \sum_{i=1}^{n} K_h(t_i - t_n)\, Y_{ih}, \qquad n \in \mathbb{N}. \tag{1} $$

We call $\hat m_{nh}$ a sequential smoother, since at the $n$-th time point the Priestley-Chao type estimator $t \mapsto \sum_i K_h(t_i - t) Y_{ih}$ is only evaluated for $t = t_n$. Here and in the sequel $K_h(z) = K(z/h)/h$ denotes the rescaled version of a smoothing kernel $K$ required to be a centered, symmetric, and Lipschitz continuous probability density. The associated $\{0, 1\}$-valued sequential decision rule is given by

$$ d_{nh} = \mathbf{1}(|\hat m_{nh}| > c), \tag{2} $$

or $d_{nh} = \mathbf{1}(\hat m_{nh} > c)$ (one-sided version), i.e., a signal is given if (the absolute value of) $\hat m_{nh}$ exceeds a prespecified non-negative threshold $c$.

The corresponding stopping time is given by

$$ N_h = \inf\{ n \in \mathbb{N} : d_{nh} = 1 \} $$

with $\inf \emptyset = \infty$. In addition, define the normed delay

$$ \rho_h = \max\{ N_h - t_q, 0 \}/h. $$

If the kernel vanishes outside the interval $[-1, 1]$, the effective sample size of the detection procedure is equal to $h$. Then $\rho_h$ is simply the delay expressed as a fraction of the effective sample size.

In this paper we will measure the efficiency of a decision procedure by the asymptotic behavior of its associated normed delay. We confine ourselves to stopping times, meaning that decisions at time $n$ only depend on $Y_1, \dots, Y_n$.
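The smoother, decision rule, stopping time and normed delay defined above can be sketched in a few lines of Python for the simple design $t_n = n$. This is a minimal illustration, not the implementation used by the author: the Epanechnikov kernel, the function names, and the noise-free test series are assumptions made here for the sake of a self-contained example.

```python
import math


def epanechnikov(z):
    """Symmetric, Lipschitz continuous density supported on [-1, 1] (an assumed kernel choice)."""
    return 0.75 * (1.0 - z * z) if abs(z) <= 1.0 else 0.0


def sequential_smoother(ys, h, kernel):
    """Statistic sum_{i <= n} K_h(t_i - t_n) * Y_i with t_i = i and n = len(ys)."""
    n = len(ys)
    return sum(kernel((i - n) / h) / h * y for i, y in enumerate(ys, start=1))


def stopping_time(ys, h, c, kernel, one_sided=True):
    """First n at which the statistic (or its absolute value) exceeds c; math.inf if no signal."""
    for n in range(1, len(ys) + 1):
        stat = sequential_smoother(ys[:n], h, kernel)
        if (stat if one_sided else abs(stat)) > c:
            return n
    return math.inf


def normed_delay(stop, q, h):
    """Normed delay max(N_h - t_q, 0) / h for the design t_q = q."""
    return max(stop - q, 0) / h


# Noise-free illustration: linear generic alternative m0(u) = u, change at q = 100, h = 50.
h, q, c = 50.0, 100, 0.05
drift_series = [max(i - q, 0) / h for i in range(1, 201)]
n_stop = stopping_time(drift_series, h, c, epanechnikov)
delay = normed_delay(n_stop, q, h)
```

Recomputing the weighted sum at every time point costs $O(n^2)$ overall, which keeps the sketch close to formula (1); a production monitor would typically restrict the sum to the kernel's support.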

Throughout the paper we shall assume that $\{\varepsilon_n\}$ is a stationary $\alpha$-mixing process in discrete time $\mathbb{N}$. Recall that $\alpha$-mixing (strongly mixing) means that $\alpha(k) \to 0$, as $k \to \infty$, where $\alpha(k)$ denotes the $\alpha$-mixing coefficient defined by

$$ \alpha(k) = \sup_{A \in \mathcal{F}_{-\infty}^{0},\, B \in \mathcal{F}_{k}^{\infty}} |P(A \cap B) - P(A)P(B)|. $$

Here $\mathcal{F}_k^l = \sigma(\varepsilon_k, \dots, \varepsilon_l)$ stands for the $\sigma$-field induced by the random variables $\varepsilon_k, \dots, \varepsilon_l$, $-\infty \le k \le l \le \infty$. Recall that $\alpha$-mixing is a weak notion of dependence which is implied by $\varphi$- and $\rho$-mixing. For a general discussion of mixing coefficients and related limit theorems we refer to Bosq (1996). The regularity assumptions on the mixing coefficients of $\{\varepsilon_n\}$ will be given later. In addition, we assume $\{\varepsilon_n\}$ satisfies Cramér's condition, i.e.,

$$ E e^{c_1 |\varepsilon_1|} < \infty $$

for some positive constant $c_1$.

The smoothing kernel $K$ used to define the weighting scheme is taken from the class

$$ \mathcal{K} = \Bigl\{ K : \mathbb{R} \to [0, \infty) : \int K(s)\,ds = 1,\ K(s) = K(-s) \Bigr\} \cap \mathrm{Lip} $$

of all symmetric probability densities on the real line which are Lipschitz continuous, i.e., there exists a Lipschitz constant $L_K$ ensuring

$$ |K(z_1) - K(z_2)| \le L_K |z_1 - z_2| \qquad (z_1, z_2 \in \mathbb{R}). $$

For our optimality results we will have to impose further conditions which will restrict the class $\mathcal{K}$.

Finally, we also need the following conditions. It is assumed that $m_0 : [0, \infty) \to \mathbb{R}$ is non-negative and satisfies, jointly with $K \in \mathcal{K}$, the following integrability condition.



$$ \Bigl| \int_0^{\infty} K(s - x)\, m_0(s)\,ds \Bigr| < \infty \qquad (\forall x \ge 0). $$

2. Asymptotics for the normed delay



In this section we establish both an assertion about the in-control false-alarm rate and a limit theorem for the normed delay for general local nonparametric alternatives under dependent sampling.

We need a specialized large deviation result for the control statistic $\hat m_{nh}$ that covers mixing time series. A related large deviation result for (unweighted) sums of random variables satisfying Cramér's condition can be found in Bosq (1996, Th. 1.4).

Define $S_{nh} = \sum_{i=1}^{n} K([t_i - t_n]/h)\,\varepsilon_i$, $n \in \mathbb{N}$, $h \in \mathcal{H}$. For two real sequences $(a_h)$ and $(b_h)$ with $b_h \neq 0$ for sufficiently large $h$, we write $a_h \sim b_h$ if $a_h/b_h \to 1$, as $h \to \infty$, and $a_h \sim b_h$ up to a constant if $a_h/b_h \to c$, $h \to \infty$, for some constant $c$.

Theorem 2.1. Assume $n/h \to \zeta$ with $0 < \zeta < \infty$. Then the following assertions hold true.

(i) For each $x > 0$ and some positive constants $c_2, c_3$,

$$ P(S_{nh} > xh) = O(\sqrt{h}\, e^{-c_2 \sqrt{h}}) + O(n\,\alpha(\sqrt{h})) + O(\sqrt{h}\, e^{-c_3 h}) = o(1), $$

as $h \to \infty$, provided $\lim_{k \to \infty} k^2 \alpha(k) = 0$.

(ii) If $\sum_k k^2 \alpha(k) < \infty$, then for each $x > 0$

$$ \sum_{h \in \mathcal{H}} P(S_{nh} > xh) < \infty, $$

implying

$$ P(S_{nh} > xh, \text{ i.o.}) = 0. $$

Remark 2.1. By construction of the stopping rule, Theorem 2.1 also makes an assertion about the in-control false-alarm rate. Note that our setting implies that the rate converges to 0, as the effective sample size $h$ tends to infinity.



Proof. Put $S_h = S_{nh}$. Fix $0 < \gamma < 1$. Note that $n/h \to \zeta$. Partition the set $\{1, \dots, n\}$ in blocks of length $l(h) = \lfloor (\zeta h)^{1/2} \rfloor$ yielding $b(h) = \lfloor n/l(h) \rfloor$ blocks. Note that $l(h) \sim h^{1/2}$ and $b(h) \sim h^{1/2}$ up to constants. We have

$$ S_h = \sum_{r=1}^{l(h)} S_h^{(r)} + R_h, $$

$$ S_h^{(r)} = \sum_{k=1}^{b(h)} K([t_{kl(h)+r} - t_n]/h)\, \varepsilon_{kl(h)+r}, $$

$$ R_h = \sum_{i=b(h)l(h)+1}^{n} K([t_i - t_n]/h)\, \varepsilon_i. $$

W.l.o.g. we can assume $b(h)l(h) = n$, since $P[R_h > xh] = O(l(h) e^{-c_3 h})$ for some constant $c_3 > 0$. Next observe that

$$ P[S_h > xh] \le \sum_{r=1}^{l(h)} P[S_h^{(r)} > (xh)/b(h)]. $$
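Markov's inequality, Cramér's condition, the strong mixing property, and Volkonskii and Rozanov (1959) provide for each $r = 1, \dots, l(h)$ and $t > 0$

$$ P[S_h^{(r)} > (xh)/b(h)] - e^{-t(xh)/b(h)} \prod_{k=1}^{b(h)} E e^{tK([t_{kl(h)+r} - t_n]/h)\, \varepsilon_{kl(h)+r}} \le 16\,(b(h) - 1)\,\alpha(l(h)). $$

It is well-known that Cramér's condition holds iff there are constants $g > 0$ and $T > 0$ such that $E e^{t\varepsilon_1} \le e^{g t^2}$ for all $|t| \le T$ and $g > (1/2) E\varepsilon_1^2$ (Petrov (1975), Lemma III.5). Thus,

$$ e^{-t(xh)/b(h)} \prod_{k=1}^{b(h)} E e^{tK([t_{kl(h)+r} - t_n]/h)\, \varepsilon_{kl(h)+r}} \le \exp\Bigl\{ g t^2 \sum_{k=1}^{b(h)} K([t_{kl(h)+r} - t_n]/h)^2 - t\,xh/b(h) \Bigr\}. $$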

Minimizing the r.h.s. w.r.t. t gives the upper bound 

ef-W2^)' i^h)/b{h)<gTC{r) 
e(-^i), {xh)/b{h)>gTC{r) 

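where $C(r) = \sum_{k=1}^{b(h)} K([t_{kl(h)+r} - t_n]/h)^2$. Observe that the time points

$$ t_{kl(h)+r}, \qquad k = 1, \dots, b(h), $$

form an equidistant partition of an interval converging to $(-\zeta, 0)$. The size of the partition equals $l(h)/h \sim h^{-1/2}$ up to a constant. Therefore, using $\int_{-\zeta}^{0} K(s)^2\,ds = \int_0^{\zeta} K(s)^2\,ds$,

$$ \frac{l(h)}{h} \sum_{k=1}^{b(h)} K([t_{kl(h)+r} - t_n]/h)^2 \to \int_0^{\zeta} K(s)^2\,ds. $$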
Consequently, $[l(h)/h]\, C(r)$ is bounded away from $0$ for large enough $h$. Thus, uniformly in $r = 1, \dots, l(h)$,

$$ P[S_h^{(r)} > (xh)/b(h)] = O(e^{-c_2 h^{1/2}}) + O(h^{1/2} \alpha(h^{1/2})), $$

for some constant $c_2 > 0$, yielding

$$ P[S_h > xh] \le \sum_{r=1}^{l(h)} P[S_h^{(r)} > (xh)/b(h)] + P[R_h > xh] = O(b(h) e^{-c_2 h^{1/2}}) + O(b(h) h^{1/2} \alpha(h^{1/2})) + O(l(h) e^{-c_3 h}). $$

Therefore, the mixing condition

$$ \lim_{k \to \infty} k^2 \alpha(k) = 0 $$

ensures

$$ P[S_h > xh] = o(1), \quad \text{as } h \to \infty. $$

Finally, the above estimates and

$$ \sum_k k^2 \alpha(k) < \infty $$

yield $\sum_{h \in \mathcal{H}} P[S_h > xh] < \infty$, and an application of Borel-Cantelli provides

$$ P[S_h > xh \text{ i.o.}] = 0. $$

□



We may now formulate our main result on the strong law of large numbers for the normed delay. Define

$$ \rho_0 = \inf\Bigl\{ \rho > 0 : \int_0^{\rho} K(s - \rho)\, m_0(s)\,ds = c \Bigr\}. \tag{3} $$

Theorem 2.2. Let $K \in \mathcal{K}$ be a given kernel and $m_0$ be a piecewise Lipschitz continuous generic alternative with $m_0(0) = 0$ and either $m_0 \ge 0$ or $m_0 \le 0$ such that $\rho_0$ exists and $0 < \rho_0 < \infty$. Then

$$ \rho_h \xrightarrow{\text{a.s.}} \rho_0, $$

as $h \to \infty$, provided that $\sum_k k^2 \alpha(k) < \infty$.

Remark 2.2. The proof even shows complete convergence.
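The limit in Theorem 2.2 is the smallest root of a one-dimensional integral equation, so it can be approximated numerically. The sketch below is an illustration under stated assumptions, not part of the paper: it takes $\rho_0$ to be the smallest $\rho > 0$ with $\int_0^{\rho} K(s - \rho) m_0(s)\,ds = c$, uses a midpoint rule for the integral, scans a grid for the first crossing, and refines it by bisection. The Epanechnikov kernel, the linear alternative in the usage lines, and all function names are hypothetical choices.

```python
def epanechnikov(z):
    """Assumed example kernel: symmetric Lipschitz density on [-1, 1]."""
    return 0.75 * (1.0 - z * z) if abs(z) <= 1.0 else 0.0


def integral_I(kernel, m0, rho, steps=1000):
    """Midpoint rule for int_0^rho K(s - rho) * m0(s) ds."""
    if rho <= 0.0:
        return 0.0
    ds = rho / steps
    return ds * sum(kernel((j + 0.5) * ds - rho) * m0((j + 0.5) * ds) for j in range(steps))


def asymptotic_delay(kernel, m0, c, rho_max=10.0, grid=400, tol=1e-8):
    """Approximate the smallest rho in (0, rho_max] where the integral reaches c.

    Returns None if no crossing is found on the grid. Assumes c > 0 and a
    continuous integral, so that the first grid crossing brackets the root.
    """
    prev = 0.0
    for j in range(1, grid + 1):
        rho = rho_max * j / grid
        if integral_I(kernel, m0, rho) >= c:
            lo, hi = prev, rho
            while hi - lo > tol:  # bisection on the bracketing interval
                mid = 0.5 * (lo + hi)
                if integral_I(kernel, m0, mid) >= c:
                    hi = mid
                else:
                    lo = mid
            return hi
        prev = rho
    return None


# Usage: linear generic alternative m0(s) = s and threshold c = 0.05.
rho_0 = asymptotic_delay(epanechnikov, lambda s: s, 0.05)
```

For this particular pair the integral has the closed form $0.75\,(\rho^2/2 - \rho^4/12)$ on $[0, 1]$, which gives an independent check of the numerical root.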
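Proof. W.l.o.g. we assume $m_0 \ge 0$. Let $\varepsilon > 0$. We shall estimate $P[\rho_h - \rho_0 > \varepsilon]$ and $P[\rho_h < \rho_0 - \varepsilon]$. Put $n(h) = \lfloor (\rho_0 + \varepsilon)h \rfloor$. Then we have

$$ \begin{aligned} P[\rho_h - \rho_0 > \varepsilon] &= P[N_h > (\rho_0 + \varepsilon)h] \\ &\le P\bigl[\, |\hat m_{n(h),h}| \le c \,\bigr] \\ &= P\Bigl[\, \Bigl| \sum_{i=1}^{n(h)} K_h(t_i - t_n)[m_{ih} + \varepsilon_i] \Bigr| \le c \,\Bigr] \\ &\le P\Bigl[\, \Bigl| \sum_{i=1}^{n(h)} K_h(t_i - t_n)\varepsilon_i \Bigr| \ge \Bigl|\, c - \Bigl| \sum_{i=q}^{n(h)} K_h(t_i - t_n) m_{ih} \Bigr| \,\Bigr| \,\Bigr], \end{aligned} $$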



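where in the last step we used the fact that $Y_{ih} = \varepsilon_i$ if $1 \le i < q = t_q$ and $Y_{ih} = m_0([t_i - t_q]/h) + \varepsilon_i$ if $q \le i \le n(h)$. For the following argument we may assume that $m_0$ is Lipschitz continuous, since otherwise one may argue on subintervals. We have

$$ \begin{aligned} \sum_{i=q}^{n(h)} K_h(t_i - t_n) m_{ih} &= h^{-1} \sum_{i=q}^{n(h)} K([i - n(h)]/h)\, m_0([i - q]/h) \\ &= h^{-1} \sum_{i=q}^{n(h)} K(i/h - \rho_0 - \varepsilon)\, m_0(i/h) + O(h^{-1}) \\ &= \int_0^{\rho_0 + \varepsilon} K(s - \rho_0 - \varepsilon)\, m_0(s)\,ds + O(h^{-1}) \\ &= \int_0^{\rho_0} K(s - \rho_0)\, m_0(s)\,ds + O(\varepsilon) + O(h^{-1}), \end{aligned} $$

since $q/h \to 0$ and $n(h)/h \to \rho_0 + \varepsilon$, as $h \to \infty$, and $K$ is Lipschitz continuous. Recalling the definition of $\rho_0$, there exists a constant $\kappa > 0$, which depends on $\varepsilon$, with $\bigl|\, c - \bigl| \sum_{i=q}^{n(h)} K_h(t_i - t_n) m_{ih} \bigr| \,\bigr| > \kappa > 0$, yielding

$$ P[\rho_h - \rho_0 > \varepsilon] \le P[S_{n(h)} > \kappa h]. $$

We may now apply Theorem 2.1 (ii) with $\zeta = \rho_0 + \varepsilon$ to conclude that

$$ \sum_h P(\rho_h - \rho_0 > \varepsilon) < \infty. $$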

h 

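To estimate $P[\rho_h < \rho_0 - \varepsilon]$ note that

$$ \{\rho_h < \rho_0 - \varepsilon\} = \{ q \le N_h \le q + \lfloor (\rho_0 - \varepsilon)h \rfloor \}, $$

since $N_h \ge q$ on $\{\rho_h < \rho_0 - \varepsilon\}$ by definition of $\rho_h$. We have

$$ \begin{aligned} P[q \le N_h \le q + \lfloor (\rho_0 - \varepsilon)h \rfloor] &\le P\Bigl[ \max_{q \le k \le q + \lfloor (\rho_0 - \varepsilon)h \rfloor} |\hat m_{kh}| > c \Bigr] \\ &\le \sum_{k=q}^{q + \lfloor (\rho_0 - \varepsilon)h \rfloor} P\Bigl[\, \Bigl| \sum_{i=1}^{k} K_h(t_i - t_k)\varepsilon_i \Bigr| > c - \Bigl| \sum_{i=q}^{k} K_h(t_i - t_k) m_{ih} \Bigr| \,\Bigr]. \end{aligned} $$

First note that

$$ c - \sum_{i=q}^{k} K_h(t_i - t_k) m_{ih} \ge \int_{\rho_0 - \varepsilon}^{\rho_0} K(s - \rho_0)\, m_0(s)\,ds > 0. $$

Further, $q \le n \le q + \lfloor (\rho_0 - \varepsilon)h \rfloor$ implies that $n/h$ is asymptotically bounded by $\rho_0 - \varepsilon$, as $n, h \to \infty$. Applying Theorem 2.1 (i) to each summand with $n = k$, and noting that there are $O(h)$ summands and, of course, $k = O(h)$, we see that

$$ P[q \le N_h \le q + \lfloor (\rho_0 - \varepsilon)h \rfloor] = O(h\sqrt{h}\, e^{-c_2 \sqrt{h}}) + O(h^2 \alpha(\sqrt{h})) + O(h\sqrt{h}\, e^{-c_3 h}) = o(1), $$

as $h \to \infty$, provided $k^4 \alpha(k) = o(1)$. Further, $\sum_k k^2 \alpha(k) < \infty$ implies

$$ \sum_h P[\rho_h - \rho_0 < -\varepsilon] < \infty, $$

yielding complete convergence,

$$ \sum_h P[|\rho_h - \rho_0| > \varepsilon] < \infty \quad \text{for every } \varepsilon > 0, $$

which implies a.s. convergence (e.g. Karr (1993), Prop. 5.7). □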

We close this section with a brief discussion of more general time designs. For some applications it may be possible and reasonable to determine at each time point the time points $t_{n1}, \dots, t_{nn}$ where observations are taken. For example, one may start with monthly observations and reduce the distance between successive observations to ensure that the most recent data points are daily measurements. Note that such a thinning effect cannot be obtained by a smoothing kernel.
Remark 2.3. Assume $F_T$ is a d.f. with support $[0, 1]$ possessing a density $f_T$. Suppose at the $n$-th time point we may select the time points where observations are taken. We assume that

$$ t_{ni} = n F_T^{-1}(i/n), \qquad i = 1, \dots, n. $$

Clearly, the choice $t_{ni} = i$ corresponds to the uniform distribution $F_T(s) = s$, $s \in [0, 1]$. When using skewed time designs, we can ensure that more recent observations dominate the sample of size $n$. It is straightforward to check that the proofs of Theorem 2.1 and Theorem 2.2 also work for that choice of time points. In this case we obtain

$$ \sum_i K_h(t_{ni} - t_{nn})\, m(t_{ni}/h) \to \int_0^{\zeta} K(\zeta(F_T^{-1}(s/\zeta) - 1))\, m_0(F_T^{-1}(s/\zeta))\,ds = \zeta \int_0^1 K(\zeta(s - 1))\, m_0(s) f_T(s)\,ds, $$

if $n/h \to \zeta$, yielding

$$ \rho_h \to \inf\Bigl\{ \rho > 0 : \rho \int_0^1 K(\rho(s - 1))\, m_0(s) f_T(s)\,ds = c \Bigr\}, $$

as $h \to \infty$, under the conditions of Theorem 2.2.
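The time design of Remark 2.3 is easy to generate once a quantile function is fixed. The following sketch is illustrative only: the function names and the skewed choice $F_T(s) = s^2$ (so that $F_T^{-1}(u) = \sqrt{u}$) are assumptions made here, not designs proposed in the paper. It shows that the uniform quantile function reproduces $t_{ni} = i$, while the skewed choice places the time points more densely near the current time $n$.

```python
import math


def design_points(n, quantile):
    """Time points t_{ni} = n * F_T^{-1}(i / n), i = 1, ..., n."""
    return [n * quantile(i / n) for i in range(1, n + 1)]


def uniform_quantile(u):
    """F_T(s) = s on [0, 1]; yields the equidistant design t_{ni} = i."""
    return u


def skewed_quantile(u):
    """Hypothetical skewed design: F_T(s) = s**2 on [0, 1], so F_T^{-1}(u) = sqrt(u)."""
    return math.sqrt(u)


uniform_design = design_points(4, uniform_quantile)
skewed_design = design_points(100, skewed_quantile)
# Spacings between successive time points shrink towards the most recent observation.
gaps = [b - a for a, b in zip(skewed_design, skewed_design[1:])]
```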



3. Optimal kernels 

The result of the previous section suggests the following kernel selection procedure. Suppose 
we are given a finite set {Ki, . . . , Km} of candidate kernels. Then we may choose the kernel 
which minimizes the corresponding asymptotic normed delay p*. For an example where this 
selection rule was successfully applied to a real data set see Steland (2002c). 
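The selection rule can be sketched as follows. This is a rough illustration under assumptions made here, not the procedure of the cited application: the three candidate kernels, the linear generic alternative, the threshold, and the coarse grid search for the asymptotic normed delay are all hypothetical choices. The delay of each candidate is approximated as the first grid point $\rho$ at which $\int_0^{\rho} K(s - \rho) m_0(s)\,ds$ reaches $c$, and the kernel with the smallest value is returned.

```python
import math


def epanechnikov(z):
    return 0.75 * (1.0 - z * z) if abs(z) <= 1.0 else 0.0


def triangular(z):
    return 1.0 - abs(z) if abs(z) <= 1.0 else 0.0


def gaussian(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def signal(kernel, m0, rho, steps=400):
    """Midpoint rule for int_0^rho K(s - rho) * m0(s) ds."""
    ds = rho / steps
    return ds * sum(kernel((j + 0.5) * ds - rho) * m0((j + 0.5) * ds) for j in range(steps))


def asymptotic_delay(kernel, m0, c, rho_max=5.0, grid=500):
    """Coarse approximation: first grid point where the signal reaches c, else math.inf."""
    for j in range(1, grid + 1):
        rho = rho_max * j / grid
        if signal(kernel, m0, rho) >= c:
            return rho
    return math.inf


def select_kernel(candidates, m0, c):
    """Return the name of the candidate with the smallest approximate delay, plus all delays."""
    delays = {name: asymptotic_delay(k, m0, c) for name, k in candidates.items()}
    best = min(delays, key=delays.get)
    return best, delays


candidates = {"epanechnikov": epanechnikov, "triangular": triangular, "gaussian": gaussian}
best_name, all_delays = select_kernel(candidates, lambda s: s, 0.05)
```

Note that the outcome depends on how the candidates are scaled: the three kernels above do not share a common variance, so a fair comparison in practice would first standardize them.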

However, the natural question arises how to optimize the asymptotic normed delay with 
respect to the smoothing kernel K. It turns out that for the setting studied in this paper 
a meaningful result can be obtained. The canonical solution of the functional optimization 
problem can be given explicitly for arguments not exceeding the asymptotic optimal delay. 
Indeed, the optimal kernel is equal to a composition of the generic alternative and a time- 
reversal transformation which depends on the optimal asymptotic normed delay. 

Although in this paper we assume that $m$ is continuous at $t = t_q$, let us briefly discuss the discontinuous classical change-point model given by $m(s) = a\,\mathbf{1}(s \ge t_q)$, $a > 0$, for all $s \in \mathcal{T}$. Brodsky and Darkhovsky (1993, Th. 4.2.8) have shown that the normed delay of the regular stopping rule (2) converges with probability 1, i.e., $\rho_h \to \rho_0$, as $h \to \infty$, where the constant $\rho_0$ is given by

$$ \rho_0 = \inf\Bigl\{ \rho : \int_{-\rho}^{0} K(s)\,ds = c/a \Bigr\}. $$

If $F_K(z) = \int_{-\infty}^{z} K(s)\,ds$, $z \in \mathbb{R}$, denotes the associated distribution function, we have the explicit solution $\rho_0(K) = -F_K^{-1}(1/2 - c/a)$, which by symmetry of $K$ equals $F_K^{-1}(1/2 + c/a)$. It is easy to show that for every $c > 0$ there exists a symmetric kernel with unit variance and bounded support such that the functional $\rho_0$ vanishes.

Therefore, in the sequel we assume that $m_0$ is a non-constant function. It will turn out that we now obtain solutions with non-vanishing optimal asymptotic normed delay. This allows us to define an ordering relation on the set of admissible generic alternatives by comparing the optimal asymptotic normed delays. Anticipating the relationship between the optimal kernel and $m_0$, we assume that $m_0$ is Lipschitz continuous. For the optimality result of this section we also have to impose the following stronger regularity assumptions on the class of admissible kernels.

(K1) $\mathcal{K}$ is a class of uniformly Lipschitz continuous probability densities with Lipschitz constant $L$, i.e.,

$$ \sup_{K \in \mathcal{K}} |K(z_1) - K(z_2)| \le L |z_1 - z_2| \qquad (z_1, z_2 \in \mathbb{R}). $$

(K2) The class $\mathcal{K}$ is uniformly bounded, i.e.,

$$ \|\mathcal{K}\|_\infty = \sup_{K \in \mathcal{K}} \|K\|_\infty \le C_{\mathcal{K}} < \infty $$

holds true for some constant $C_{\mathcal{K}}$.

Define the mapping $I : \mathcal{K} \times [0, \infty) \to \mathbb{R}$,

$$ I(K, \rho) = \int_0^{\rho} K(s - \rho)\, m_0(s)\,ds, $$

and denote by

$$ \mathcal{R}(\rho) = \{ I(K, \rho) : K \in \mathcal{K} \} $$

the reachable set at time $\rho$. It is clear that $\mathcal{R}(\rho)$ is closed when $\mathcal{K}$ is equipped with the uniform topology induced by the supnorm.
Definition 3.1. A pair $(K^*, \rho^*) \in \mathcal{K} \times [0, \infty)$ is optimal, if

$$ \rho^* = \inf\{ \rho : c \in \mathcal{R}(\rho) \} $$

and $K^*$ ensures that

$$ I(K^*, \rho^*) = c. $$

For fixed $K \in \mathcal{K}$ define

$$ \Psi(\rho) = I(K, \rho) = \int_0^{\rho} K(s - \rho)\, m_0(s)\,ds. $$

We will assume that there exists a positive $R \in \mathbb{R}$ such that $\Psi$ is a strictly increasing function on $[0, R)$. Note that $\Psi$ is continuous since $K$ is Lipschitz continuous by assumption. We have the following theorem on the existence of optimal kernels.

Theorem 3.1. Assume there exists $K_1 \in \mathcal{K}$ and some $\rho_1 > 0$ with

$$ I(K_1, \rho_1) = c. $$

Then there exists an optimal kernel $K^* \in \mathcal{K}$, i.e.,

$$ I(K^*, \rho^*) = c, $$

where $\rho^* = \inf\{ \rho : c \in \mathcal{R}(\rho) \}$ is the optimal asymptotic normed delay.

Remark 3.1. For many generic alternatives $m_0$ it should be a trivial task to verify that the condition of Theorem 3.1 holds true.

Proof. By assumption we have $c \in \mathcal{R}(\rho_1)$. Let $\rho^* = \inf\{ \rho : c \in \mathcal{R}(\rho) \}$. Then $0 \le \rho^* \le \rho_1$. We shall show $c \in \mathcal{R}(\rho^*)$. Then, by definition of $\mathcal{R}(\rho^*)$, there exists an optimal kernel $K^* \in \mathcal{K}$ with $I(K^*, \rho^*) = c$. That $c \in \mathcal{R}(\rho^*)$ is a consequence of the following continuity argument. There exists a non-increasing sequence $\{\rho_n\}$ with $\rho_n \to \rho^*$ and an associated sequence $\{K_n\} \subset \mathcal{K}$ with

$$ I(K_n, \rho_n) = c \in \mathcal{R}(\rho_n). $$
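Since $K_n \in \mathcal{K}$, $I(K_n, \rho^*) \in \mathcal{R}(\rho^*)$. We have

$$ \begin{aligned} |I(K_n, \rho^*) - I(K_n, \rho_n)| &= \Bigl| \int_0^{\rho^*} K_n(s - \rho^*)\, m_0(s)\,ds - \int_0^{\rho_n} K_n(s - \rho_n)\, m_0(s)\,ds \Bigr| \\ &\le \Bigl| \int_0^{\rho^*} [K_n(s - \rho^*) - K_n(s - \rho_n)]\, m_0(s)\,ds \Bigr| + \Bigl| \int_{\rho^*}^{\rho_n} K_n(s - \rho_n)\, m_0(s)\,ds \Bigr| \\ &\le L |\rho^* - \rho_n| \int_0^{\rho^*} m_0(s)\,ds + \Bigl[ \int_{\rho^*}^{\rho_n} K_n^2(s - \rho_n)\,ds \Bigr]^{1/2} \Bigl[ \int_{\rho^*}^{\rho_n} m_0^2(s)\,ds \Bigr]^{1/2}. \end{aligned} $$

Clearly, by assumption (K2), $\int_{\rho^*}^{\rho_n} K_n^2(s - \rho_n)\,ds \le C_{\mathcal{K}}^2 |\rho^* - \rho_n|$. Therefore,

$$ |I(K_n, \rho^*) - I(K_n, \rho_n)| = o(1), $$

as $n \to \infty$, yielding $I(K_n, \rho^*) \to c$, as $n \to \infty$. Since $I(K_n, \rho^*) \in \mathcal{R}(\rho^*)$ for all $n$ and since $\mathcal{R}(\rho^*)$ is closed, we obtain

$$ c = \lim_{n \to \infty} I(K_n, \rho^*) \in \mathcal{R}(\rho^*). $$

□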

The following Lemma provides a useful characterization of each optimal pair $(K^*, \rho^*)$ and is crucial to calculate optimal kernels.

Lemma 3.1. Assume $(\rho^*, K^*) \in [0, R] \times \mathcal{K}$ is optimal. Then

$$ \int_0^{\rho^*} K^*(s - \rho^*)\, m_0(s)\,ds = \sup_{K \in \mathcal{K}} \int_0^{\rho^*} K(\rho^* - s)\, m_0(s)\,ds. $$

Proof. Assume there exists some $\tilde K \in \mathcal{K}$ with

$$ \int_0^{\rho^*} \tilde K(\rho^* - s)\, m_0(s)\,ds > \int_0^{\rho^*} K^*(\rho^* - s)\, m_0(s)\,ds = c. $$

Since $\rho \mapsto \int_0^{\rho} \tilde K(\rho - s)\, m_0(s)\,ds$ is strictly increasing and continuous for $\rho \in [0, \rho^*]$, there exists a $\rho^{**}$ with $\rho^{**} < \rho^*$ such that

$$ \int_0^{\rho^{**}} \tilde K(\rho^{**} - s)\, m_0(s)\,ds = c, $$

implying that the pair $(\rho^*, K^*)$ is not optimal, which is a contradiction.

□



We are now in a position to formulate and prove the following result about the explicit representation of the optimal kernel for a given generic alternative.

Theorem 3.2. In addition to the regularity assumptions of this section assume

$$ 0 < \int_0^{\infty} m_0(s)\,ds < \infty $$

and that the set

$$ S = \Bigl\{ \rho > 0 : \frac{\int_0^{\rho} m_0^2(s)\,ds}{2 \int_0^{\infty} m_0(s)\,ds} = c \Bigr\} $$

is non-empty. Then the following conclusions hold true.

(i) The optimal asymptotic normed delay is given by $\rho^* = \inf S$.

(ii) The optimal smoothing kernel $K^*$ satisfies

$$ K^*(z) = \frac{m_0(\rho^* - |z|)}{2 \int_0^{\infty} m_0(s)\,ds}, \qquad z \in [-\rho^*, \rho^*]. $$

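Proof. Let $K \in \mathcal{K}$ be an arbitrary candidate kernel. By the Cauchy-Schwarz inequality we have

$$ \int_0^{\rho^*} K(\rho^* - s)\, m_0(s)\,ds \le \Bigl[ \int_0^{\rho^*} K^2(\rho^* - s)\,ds \Bigr]^{1/2} \Bigl[ \int_0^{\rho^*} m_0^2(s)\,ds \Bigr]^{1/2} $$

with equality if and only if

$$ K(\rho^* - s) = \lambda \cdot m_0(s), \qquad \forall s \in [0, \rho^*], $$

for some constant $\lambda \in \mathbb{R}$. Since $\int_0^{\infty} K(s)\,ds = 1/2$ and $m_0(s) = 0$ if $s < 0$,

$$ \lambda^{-1} = 2 \int_0^{\infty} m_0(s)\,ds. $$

Therefore, since $\mathcal{K}$ is a symmetric class,

$$ K^*(s) = \frac{m_0(\rho^* - |s|)}{2 \int_0^{\infty} m_0(s)\,ds}, \qquad s \in [-\rho^*, \rho^*]. $$

Finally, we obtain

$$ \int_0^{\rho^*} K^*(\rho^* - s)\, m_0(s)\,ds = \frac{\int_0^{\rho^*} m_0^2(s)\,ds}{2 \int_0^{\infty} m_0(s)\,ds}. $$

□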
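The representation of the optimal pair lends itself to a small numerical sketch. The code below rests on two assumptions that should be read as such: that the optimal delay is the smallest $\rho$ solving $\int_0^{\rho} m_0^2(s)\,ds = 2c \int_0^{\infty} m_0(s)\,ds$, and that the optimal kernel equals $m_0(\rho^* - |z|) / (2\int_0^{\infty} m_0(s)\,ds)$ on $[-\rho^*, \rho^*]$. The function names, the `support` argument (an upper bound beyond which $m_0$ vanishes), and the truncated linear drift with $a = 2$, $T = 1$ are hypothetical choices for illustration.

```python
def midpoint(f, a, b, steps=2000):
    """Midpoint rule for the integral of f over [a, b]."""
    width = (b - a) / steps
    return width * sum(f(a + (j + 0.5) * width) for j in range(steps))


def optimal_pair(m0, c, support, tol=1e-10):
    """Approximate (rho_star, k_star) under the assumed representation.

    m0 is taken to vanish outside [0, support]. Returns None if the
    defining equation has no solution in (0, support].
    """
    total = midpoint(m0, 0.0, support)   # int_0^infty m0(s) ds
    target = 2.0 * c * total

    def energy(rho):                     # int_0^rho m0(s)^2 ds, nondecreasing in rho
        return midpoint(lambda s: m0(s) ** 2, 0.0, rho)

    if energy(support) < target:
        return None
    lo, hi = 0.0, support
    while hi - lo > tol:                 # bisection for the smallest root
        mid = 0.5 * (lo + hi)
        if energy(mid) >= target:
            hi = mid
        else:
            lo = mid
    rho_star = hi

    def k_star(z):
        if abs(z) > rho_star:
            raise ValueError("the representation only covers |z| <= rho_star")
        return m0(rho_star - abs(z)) / (2.0 * total)

    return rho_star, k_star


def linear_drift(s):
    """Hypothetical truncated linear drift a * t on [0, T] with a = 2, T = 1."""
    return 2.0 * s if 0.0 <= s <= 1.0 else 0.0


rho_star, k_star = optimal_pair(linear_drift, 0.1, 1.0)
```

For a truncated linear drift $m_0(t) = a t\,\mathbf{1}_{[0,T]}(t)$ the assumed equation reduces to $a\rho^3/(3T^2) = c$, so the sketch can be checked against $\rho^* = (3cT^2/a)^{1/3}$ whenever this value does not exceed $T$.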
4. Examples



Let us consider some special cases to illustrate the results. 
Example 4.1. For a truncated linear drift,

$$ m_0(t) = a\,t\,\mathbf{1}_{[0,T]}(t), \qquad t \ge 0, $$



we obtain p* = ^- and 



Example 4.2. Assume the generic alternative is given by a truncated exponential drift

$$ m_0(t) = e^{\lambda t}\,\mathbf{1}_{[0,T]}(t), \qquad t \ge 0, $$

for some $\lambda > 0$, where $T$ is a positive truncation constant. If $T > \rho^*$, we have

^ ^ ff* ml{s) ds ^ ^ ^ 
mo{s) ds 

Hence, the optimal asymptotic normed delay is given by 

ln(c - 1) 

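The optimal kernel $K^*$ is given by

$$ K^*(z) = \frac{\lambda\, e^{\lambda(\rho^* - |z|)}}{2\,(e^{\lambda T} - 1)}, \qquad |z| \le \rho^*. $$

Note that $K^*$ converges to the density of the Laplace distribution, $(\lambda/2)e^{-\lambda|z|}$, if $\rho^* = T \to \infty$. Hence, for exponential drifts exponential weighting schemes are asymptotically optimal in this sense.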

Example 4.3. Usually, enzyme processes are described by the Michaelis-Menten framework. Exploiting the quasi-steady-state approximations, the enzyme kinetics can be summarized by the differential equation

$$ \frac{d[S]}{dt} = -\frac{V_{\max}[S]}{K_M + [S]} $$

with initial condition $[S](0) = [S_0]$, where $[S](t)$ stands for the substrate concentration at time $t$. $K_M$ denotes the Michaelis-Menten rate constant, and $V_{\max}$ is the maximal velocity. For further details we refer to Schnell and Mendoza (1997). The solution of the differential equation is given by

$$ [S](t) = K_M\, W\Bigl( \frac{[S_0]}{K_M} \exp\Bigl( \frac{[S_0] - V_{\max} t}{K_M} \Bigr) \Bigr), $$
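where $W$ stands for Euler's omega function, the (principal branch of the) inverse of the function $x \mapsto x\exp(x)$ (Euler (1777), Corless et al. (1996)). The optimal kernel to detect the generic alternative

$$ m_0(t) = (S_0 - [S])(t)\,\mathbf{1}_{[0,T]}(t) $$

is given by

$$ K^*(z) = (S_0 - [S])(\rho^* - |z|)/C_S, \qquad |z| \le \rho^*. $$

Observing that $\int W(d e^{\lambda t})\,dt = \lambda^{-1} \int W(y) y^{-1}\,dy$ with $y = d e^{\lambda t}$, $d, \lambda \in \mathbb{R}$, and using the formulas

$$ \int W(y)/y\,dy = \tfrac{1}{2} W(y)[2 + W(y)], $$

$$ \int W(y)^2/y\,dy = \tfrac{1}{6} W(y)^2[3 + 2W(y)], $$

one may obtain explicit formulas for $C_S = 2\int_0^{T} (S_0 - [S])(t)\,dt$ and for the numerator and denominator of the nonlinear equation

$$ c = \frac{\int_0^{\rho^*} m_0^2(s)\,ds}{2 \int_0^{T} m_0(s)\,ds}. $$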